The appropriate choice of the genotype-phenotype
mapping in combination with the
mutation operator is important for a success- 
ful evolutionary search process. We suggest a 
measure to quantify the quality of this com- 
bination by addressing the question whether 
the relation among distances is carried over 
from one space to the other. Search pro- 
cesses which do not destroy the neighbour- 
hood structure are termed strongly causal. 
We apply the proposed measure to parameter 
and structure optimisation problems in order 
to assess the combination (mapping, muta- 
tion operator) and at the same time to be 
able to propose improved settings. 



1 Introduction 

The optimisation process in evolutionary algorithms is 
largely influenced by the mapping from the genotype 
space to the phenotype space. Especially for structure 
optimisation problems a measure of the quality of the 
combination (mapping, mutation, crossover) would be 
desirable. In this paper we propose such a measure 
based upon the observation that Darwinian evolution
proceeds by gradual changes towards the optimum, although in
biological evolution other phenomena like punctuated 
equilibria are also observed. 

We demand that the search process is locally strongly 
causal with respect to the mutation operator, that is: 
small variations on the genotype space due to mutation 
imply small variations in the phenotype space. This 
way the neighbourhood structure under the mapping 
Q → V is conserved, see Figure 1. The distance on
the genotype space is defined via the mutation probability.
The need for a strongly causal exploration of the search space
has been expressed before (Rechenberg 1994; Lohmann 1993).
However, in the following we want to quantify the degree to
which the setting (mapping, mutation operator) satisfies the
causality condition.

The distance measure and therefore the causality condition
in section 2 only depends on the mutation and
not on the crossover operator. This does not represent 
any opinion whether one or the other is the driving 
force in evolutionary algorithms. However, we believe 
that the mutation operator usually is responsible for 
small steps in the phenotype space, hence for gradual 
changes which we want to analyse. Furthermore, we 
assume a locally smooth fitness function and define 
conditions for the genotype → phenotype mapping for
this problem domain. Thus, unlike in correlation based
analysis (Jones et al. 1995; Manderick et al. 1991),
we do not explicitly refer to a fitness landscape; instead
we focus on the conservation of neighbourhood
structures.

In the next section we will propose a condition for a 
strongly causal search process and quantify it by intro- 
ducing a probabilistic interpretation of the condition. 
Section 3 presents a first application in the domain
of parameter optimisation problems and the following
section is concerned with the structure optimisation of
neural networks, where complicated genotype → phenotype
mappings are commonly used.



2 A Condition for Causality 

In section 1 we claimed that for the successful introduction
of new information by mutation the mutation
operator should preserve the neighbourhood structure
in the corresponding evolutionary spaces. We believe
that strong causality is necessary in evolutionary algorithms

• to allow for controlled small steps in the phenotype
space which are provoked by small steps in
the genotype space. Especially in the vicinity of
an optimum we need small steps to gradually approach
the optimum.

• for the ability of self-adaptation of any strategy
parameters, since with the lack of strong causality
the information about the past is meaningless and
adaptation is impossible.

Figure 1: Examples for strongly, weakly and non-causal
genotype-phenotype mappings under the influence
of mutation. Circles denote genotypes (g_i, g_j, g_k);
g_j and g_k are results of mutations from g_i. The corresponding
phenotypes (p_i, p_j, p_k) are shown as squares.
The first, strongly causal mapping does not destroy the
neighbourhood structure in genotype space; the second,
weakly causal mapping maps small mutations in
genotype space to large distances in phenotype space
and vice versa. The last example shows a non-causal
mapping.

In order to formulate the causality condition we have 
to define the term small variation in a mathematical 
sense. Therefore, we introduce a measure of distance 
in the genotype and phenotype space. For the mathe- 
matical correctness we have to show that the measure 
in the respective spaces endows these spaces with a 
metric. For distances in the genotype space we pro- 
pose a "universal" measure which is based on the prob- 
ability of reaching genotype gj from genotype gi. In 
this respect it resembles definitions of distance used in 
evolutionary biology (Schuster 1995a): "... the notion
of distance in genotype space is given by the smallest
number of individual mutations required for the interconversion
of two genotypes ...". Furthermore, this
measure is general enough to be applicable to a wide 
range of evolutionary algorithms. 
We introduce the following notations: Genotype space
Q = {g_i} and phenotype space V = {p_i}. Both Q and
V can also be continuous spaces. The mapping between
the spaces is f : Q → V, thus p_i = f(g_i). The
operators mutation and crossover act upon the space
Q, the selection operator acts upon the fitness space
T and therefore on V. We parameterise the mutation
operator by a real valued vector σ ∈ ℝ^l.

Now, we will introduce the definition of distance on Q,
based on the mutation probability P(g_i → g_j) of reaching
g_j from g_i in Q via mutation which is characterised
by σ.

This definition is only sensible if we claim that
P(g_i → g_j) < P_id, where P_id is the probability not to
mutate, and that P_id is independent of g, which is
satisfied by most evolutionary algorithms [1]. The
logarithm in eq. (1) is introduced in order to make the
distance measure additive instead of multiplicative. The
properties of this measure are discussed in (Sendhoff et al. 1997).

Eq. (1) allows for the comparison between different
EAs independent of any particular metric on the genotype
space, like Hamming distance or Euclidean distance.
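
The role of the logarithm can be illustrated for bit-flip mutation. The following is a minimal sketch and not the formula of eq. (1) itself: it assumes that the distance is the negative logarithm of the transition probability normalised by P_id (so that a genotype has distance zero to itself) and that each of the L bits flips independently with rate p_m. The function names are illustrative.

```python
import math


def transition_probability(h: int, length: int, p_m: float) -> float:
    """Probability that bit-flip mutation turns a string into another
    string at Hamming distance h (each bit flips independently with p_m)."""
    return (p_m ** h) * ((1.0 - p_m) ** (length - h))


def mutation_distance(h: int, length: int, p_m: float) -> float:
    """Assumed form of the distance: -log(P(g_i -> g_j) / P_id),
    where P_id is the probability that no bit flips."""
    p_id = transition_probability(0, length, p_m)
    return -math.log(transition_probability(h, length, p_m) / p_id)


if __name__ == "__main__":
    length, p_m = 8, 0.1
    for h in range(4):
        # The distance grows in equal increments per additional flipped bit.
        print(h, round(mutation_distance(h, length, p_m), 4))
```

Under these assumptions the distance is proportional to the Hamming distance for p_m < 0.5, which is one way to see why the logarithm makes the measure additive.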

Now, we can proceed with the definition of causality.

Condition: Strong causality

$$\forall\, g_i, g_j, g_k \;\; \exists\, \sigma', \varepsilon \ \text{with}\ \sigma \in U_\varepsilon(\sigma'):$$
$$\|f(g_i) - f(g_j)\| < \|f(g_i) - f(g_k)\| \;\Longleftrightarrow\; P(g_i \to g_j) > P(g_i \to g_k) \tag{2}$$

The additional condition that σ can be drawn from
anywhere inside a sphere with radius ε (ε can be
sufficiently small) around σ' guarantees that the effect of
mutation varies continuously with σ. That is, besides
the existence of an appropriate σ, we have to guarantee
that it can be located. Mathematically, the space
of all mutation parameters which satisfy the causality
condition is not empty and additionally not of
measure zero.

[1] In GAs P(g_i → g_j) < P_id corresponds to a mutation
rate p < 0.5 (p = 0.5 leads to random initialisation) and in
ES to normally distributed mutations with zero mean.

We have indicated that our analysis is concerned with
the local behaviour of evolutionary search. Therefore,
condition (2) should not be seen as a global condition.
The term local is difficult to define. However, an absolute
measure of locality is not necessary since we are
interested in the relative performance of EAs.

Condition (2) defines strong causality in both directions.
Small distances and variations on the phenotype
space imply small distances and variations in the genotype
space with respect to the probability of jumping
this distance via mutation and vice versa. However,
in most EAs the second direction is more important.
That is, small variations in the genome provoke small
variations in the phenotype.

So far we have only set up a qualitative condition for
strong causality. In order to compare between EAs,
we have to find a quantitative version of condition (2).
We will rephrase it in the light of a probabilistic interpretation.

Assuming g_i, g_j, g_k to be random variables with
uniform distribution, both sides of condition (2) become
boolean random variables. As a shortcut, we
introduce the symbols A and B:

$$A := \|f(g_i) - f(g_j)\| < \|f(g_i) - f(g_k)\| \tag{3}$$
$$B := P(g_i \to g_j) > P(g_i \to g_k) \tag{4}$$

Since we assume the distribution of g_i, g_j, g_k to be
known, we can derive the probabilities P(A), P(B),
and P(A, B). We can now, with the help of Bayes'
law, recast the two directions (genotype ↔ phenotype)
in the following way:

Probabilistic condition: Strong causality

$$\forall\, g_i, g_j, g_k \;\; \exists\, \sigma', \varepsilon \ \text{with}\ \sigma \in U_\varepsilon(\sigma'):$$
$$Q \Rightarrow V:\quad P(A|B) = \frac{P(A, B)}{P(B)} = 1 \tag{5}$$
$$V \Rightarrow Q:\quad P(B|A) = \frac{P(A, B)}{P(A)} = 1 \tag{6}$$

The value of P(A|B) serves as a quantitative measure
for the causality in EAs. If the neighbourhood relations
in both spaces are uncorrelated for every point,
then the system is weakly but not strongly causal:
P(A, B) = P(A) · P(B) and therefore P(A|B) = P(A),
P(B|A) = P(B). Thus, distance relations in the phenotype
space are statistically independent from distance
relations in the genotype space and vice versa [2]. One
example for such systems is the class of Monte Carlo
algorithms where the transition probability between
any pair of genotypes is constant. For constant transition
probabilities, B in equation (4) is constant for
all genotype combinations and does therefore not provide
any information about the distance relation in the
phenotype space.
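
The probabilistic condition lends itself to a simple Monte Carlo estimate. The sketch below is only an illustration under stated assumptions: the function `estimate_causality` and the toy problem (integer genotypes, identity mapping, a transition probability that decays with distance) are hypothetical and not taken from the paper. Event A compares distances of the mapped phenotypes, event B compares transition probabilities, and the conditional probabilities are obtained from the counts.

```python
import math
import random


def estimate_causality(sample_triple, mapped_distance, transition_probability,
                       trials=10_000, seed=0):
    """Monte Carlo estimate of (P(A|B), P(B|A)) over sampled triples."""
    rng = random.Random(seed)
    n_a = n_b = n_ab = 0
    for _ in range(trials):
        g_i, g_j, g_k = sample_triple(rng)
        # A: g_j is mapped closer to g_i than g_k in phenotype space.
        a = mapped_distance(g_i, g_j) < mapped_distance(g_i, g_k)
        # B: g_j is more likely to be reached from g_i than g_k.
        b = transition_probability(g_i, g_j) > transition_probability(g_i, g_k)
        n_a += a
        n_b += b
        n_ab += a and b
    p_a_given_b = n_ab / n_b if n_b else float("nan")
    p_b_given_a = n_ab / n_a if n_a else float("nan")
    return p_a_given_b, p_b_given_a


def sample_integers(rng):
    """Toy genotype space: integers 0..99, sampled uniformly."""
    return tuple(rng.randrange(100) for _ in range(3))


def identity_distance(g_a, g_b):
    """Phenotype distance under the identity mapping."""
    return abs(g_a - g_b)


def local_transition(g_a, g_b):
    """Toy transition probability (unnormalised) that decays with distance."""
    return math.exp(-abs(g_a - g_b))


if __name__ == "__main__":
    # A and B coincide for this toy setting, so this prints (1.0, 1.0).
    print(estimate_causality(sample_integers, identity_distance, local_transition))
```

Replacing `identity_distance` by a distance under a scrambled mapping would decouple A and B, so that both estimates move towards P(A) and P(B).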

In evolutionary molecular biology measures similar to 
this probabilistic formulation of the causality condition
are employed in the context of the analysis of the
"sequence-structure" mapping (Schuster 1995b). 



3 Parameter Optimisation 

Firstly, we employ one of the mainstream paradigms 
of EAs - the evolution strategy (ES) and show that 
the ES is strongly causal in terms of our proposed 
condition. As an example for an EA, which violates 
the causality condition, we analyse the canonical ge- 
netic algorithm (GA) applied to parameter optimisa- 
tion. We propose a new mutation operator for the GA 
which observes strong causality to a greater extent and 
show that this also increases the performance. 

3.1 Evolution Strategy 

We firstly focus on the transition probability. In the
canonical ES Q = V = ℝ^n and the genotype → phenotype
mapping is the identity f : Q → V = id_{ℝ^n}. It
uses normally distributed mutation steps which are independent
of the genotype g_i ∈ Q. That is, the transition
g_j → g_k is defined by adding a normally distributed
vector z = (z_1, ..., z_n) with z_i ~ N(0, σ²). Hence,
the pdf of this transition can be expressed in terms of
the Euclidean distance between g_j and g_k:

$$p(g_j \to g_k) = (2\pi\sigma^2)^{-n/2} \exp\!\left(-\frac{\|g_j - g_k\|^2}{2\sigma^2}\right)$$

Inserting the transition pdf in the causality condition
(2) with f : Q → V = id_{ℝ^n} results in

$$\|f(g_i) - f(g_j)\| < \|f(g_i) - f(g_k)\|$$
$$\Longleftrightarrow\; \|g_i - g_j\| < \|g_i - g_k\|$$
$$\Longleftrightarrow\; \exp\!\left(-\frac{\|g_i - g_j\|^2}{2\sigma^2}\right) > \exp\!\left(-\frac{\|g_i - g_k\|^2}{2\sigma^2}\right)$$
$$\Longleftrightarrow\; p(g_i \to g_j) > p(g_i \to g_k) \tag{11}$$

which holds for all combinations of g_i, g_j, g_k, σ.

[2] Whether the system is non-causal in the sense of being
non-deterministic is not determined by eqs. (5, 6), since
we do not observe whether the mapping from genotype to
phenotype space itself is probabilistic or not.

The examination of the metric conditions of the distance
measure of an ES and some notes on the self-adaptation
of σ are presented in (Sendhoff et al. 1997).
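
The equivalence derived for the ES can be checked numerically. The following sketch assumes isotropic Gaussian mutation with a single σ and the identity mapping; the function names and the sampling of test points are illustrative choices. It works with the logarithm of the density to avoid numerical underflow, which does not change the ordering.

```python
import math
import random


def log_gaussian_transition(g_from, g_to, sigma):
    """Log of the isotropic Gaussian mutation density for g_from -> g_to."""
    n = len(g_from)
    squared = sum((a - b) ** 2 for a, b in zip(g_from, g_to))
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - squared / (2.0 * sigma ** 2)


def es_is_causal_on_samples(trials=1000, n=5, sigma=0.5, seed=1):
    """Return True if, for every sampled triple, the closer point is
    also the one with the higher transition density."""
    rng = random.Random(seed)
    for _ in range(trials):
        g_i, g_j, g_k = ([rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3))
        closer = math.dist(g_i, g_j) < math.dist(g_i, g_k)
        likelier = (log_gaussian_transition(g_i, g_j, sigma)
                    > log_gaussian_transition(g_i, g_k, sigma))
        if closer != likelier:
            return False
    return True


if __name__ == "__main__":
    print(es_is_causal_on_samples())
```

A sampled check of this kind cannot replace the derivation, but it makes the statement "closer in phenotype space if and only if more probable under mutation" concrete.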



3.2 Genetic Algorithms 

In the case of genetic algorithms (GA) the genotype
space consists of binary strings of length L, therefore
Q = {0, 1}^L. Canonical GAs mutate by changing each
bit position from 0 → 1 and 1 → 0, respectively, with
the probability p_m. Thus, p_m corresponds to the mutation
parameter σ. Let h_ij denote the Hamming distance
between g_i and g_j.

In order to examine the causality condition we use
the Euclidean metric on the phenotype space V and
choose the standard binary coding for the genotype-phenotype
mapping f : Q → V. We write x_i(n) ∈ {0, 1}
for the n-th bit of g_i, so that the decoded value is
determined by the sum of x_i(n) 2^n over n = 0, ..., L−1, and
the transition probability is
P(g_i → g_j) = p_m^{h_ij} (1 − p_m)^{L − h_ij}. The causality
condition then compares distances between decoded values
with these transition probabilities. Assuming p_m < 0.5,
the transition probability decreases strictly with the
Hamming distance, so the right-hand side of the condition
can be expressed as h_ij < h_ik. Therefore the causality
condition reads

$$\left|\sum_{n=0}^{L-1} \big(x_i(n) - x_j(n)\big)\, 2^n\right| < \left|\sum_{n=0}^{L-1} \big(x_i(n) - x_k(n)\big)\, 2^n\right|$$
$$\Longleftrightarrow\; \sum_{n=0}^{L-1} |x_i(n) - x_j(n)| < \sum_{n=0}^{L-1} |x_i(n) - x_k(n)| \tag{17}$$

which obviously does not hold in general, not even
locally.
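
A small counterexample shows why the condition fails for standard binary coding. The sketch assumes 4-bit genotypes decoded as unscaled integers; the three bit strings are chosen for illustration only.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Genotypes as 4-bit strings; the phenotype is the decoded integer.
g_i, g_j, g_k = 0b0111, 0b1000, 0b1111  # decoded: 7, 8, 15

# A: g_j is closer to g_i than g_k in phenotype space (1 < 8).
a_holds = abs(g_i - g_j) < abs(g_i - g_k)
# B: g_j is closer to g_i than g_k in Hamming distance (4 < 1 is false).
b_holds = hamming(g_i, g_j) < hamming(g_i, g_k)

print(a_holds, b_holds)  # prints: True False
```

The neighbouring phenotypes 7 and 8 differ in all four bits, whereas a single bit flip carries 7 to 15, so A and B disagree for this triple.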

As a measure of the extent to which the GA satisfies
the causality condition we employ the probabilistic
version of the condition. After some extensive calculations
which are presented in (Sendhoff et al. 1997) we
get P(A|B) ≈ 0.51 and P(B|A) ≈ 0.62. That is, the
chance of a small mutation of a genotype resulting in a 
small change of the corresponding phenotype is about 
51%. The probability that a small change of a pheno- 
type is caused by a small mutation of the genotype is 
somewhat higher, about 62%. Thus, in the case of a 
canonical GA, the mapping from the genotype to the 
phenotype is not strongly causal. In our opinion the 
combination of binary coding and point mutation is 
not well suited for continuous parameter optimisation 
combined with locally smooth fitness functions and is 
the reason why ES, which observes strong causality, 
outperforms the GA in most cases in this problem do- 
main. 
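
For readers who want to experiment, the conditional probabilities for a bit-string coding can also be estimated by sampling. The sketch below makes several assumptions that need not match the calculation in (Sendhoff et al. 1997): triples are sampled uniformly and globally, strict inequalities are used, and the decoded value is an unscaled integer. Its output is therefore not expected to reproduce the figures quoted above. The helper names, and the inclusion of a Gray decoder for comparison, are illustrative.

```python
import random


def hamming(a, b):
    """Number of differing bits between two bit strings stored as ints."""
    return bin(a ^ b).count("1")


def binary_decode(g):
    """Standard binary coding: the bit string is read as an integer."""
    return g


def gray_decode(g):
    """Interpret g as a reflected Gray code and return the integer value."""
    value = 0
    while g:
        value ^= g
        g >>= 1
    return value


def estimate(decode, length=10, trials=20_000, seed=0):
    """Estimate (P(A|B), P(B|A)) for uniformly sampled genotype triples."""
    rng = random.Random(seed)
    n_a = n_b = n_ab = 0
    for _ in range(trials):
        g_i, g_j, g_k = (rng.getrandbits(length) for _ in range(3))
        x_i, x_j, x_k = decode(g_i), decode(g_j), decode(g_k)
        a = abs(x_i - x_j) < abs(x_i - x_k)                 # phenotype side
        b = hamming(g_i, g_j) < hamming(g_i, g_k)           # genotype side, p_m < 0.5
        n_a += a
        n_b += b
        n_ab += a and b
    return n_ab / max(n_b, 1), n_ab / max(n_a, 1)


if __name__ == "__main__":
    print("binary:", estimate(binary_decode))
    print("gray:  ", estimate(gray_decode))
```

Restricting the sampling to triples that lie close to each other would be closer in spirit to the local analysis of the paper.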

3.3 A New Mutation Operator 

We have seen that in GA the standard mutation oper- 
ator together with the binary encoding does not satisfy 
the causality condition in general. Possible solutions 
to this problem are to use a different encoding scheme,
e.g. the Gray code [3], to change the mutation operator
and keep the encoding scheme, or to change both.

In the remainder of this section we will partly outline
an approach, presented in detail in (Sendhoff
et al. 1996), which sticks to the concept of point mutation,
but uses a position dependent mutation rate
p_m = p_m(i). This will provide us with an interesting
example of an EA, where a modification of the muta- 
tion operator enhances the causality and, as we will 
see, also the performance. 



Figure 2: Top: p_m(i) for σ = 10.0 (dashed curve
- numerical approximation), bottom: p_m(i) (rescaled
with a factor 8) for σ = 0.5 (dashed curve - numerical
approximation).

As we have seen above, the ES is a strongly causal
optimisation procedure. Therefore, we translate the
concept of mutation by adding normally distributed
numbers in ES to point mutation in GAs. Thus, we
calculate a probability distribution which will on average
resemble the summation of a normally distributed
number. Depending on the standard deviation σ of the
underlying normal distribution we get different distributions
of the mutation rates p_m(i), see Figure 2. For
the efficient use of the new mutation operator we derived
a numerical approximation of p_m(i|σ) which is
presented in (Sendhoff et al. 1996).

[3] Although we show in (Sendhoff et al. 1997) that the
Gray code does not increase the causality substantially.

Figure 3: Convergence plots of optimisation runs.
The solid line shows the results obtained by position-dependent
mutations, the dashed lines those of the
canonical GA (population size: 50; dimension: n =
30; encoding length: 32 bits using Gray code).

The numerical estimates of the causality measure are
P(A|B) ≈ 0.73 and P(B|A) ≈ 0.74. In order to support
our hypothesis that increasing the strong causality
in an optimisation process leads to an improved
performance we apply the modified GA to two standard
optimisation problems. Results are given for the
sphere model and Ackley's function, see Figure 3. The
new GA converges faster to a better value than the
canonical GA. Figure 3 shows that in case of the sphere
model the increase of convergence speed is of order 10^5
and for the Ackley function of order 10^15.

4 Causality in Structure Optimisation 

The problem to choose the right genotype → phenotype
mapping is of particular importance in the domain
of structure optimisation. We will here concentrate
on the structure optimisation of neural networks.
We will regard the set of all possible connection matrices
as the phenotype space, and allow the matrix to
have entries from {0, 1} or from {0, ..., N_sym}. Since
there is no measure on the space of these matrices
which relates directly to the performance of the neural
network without evaluating the network, we will
use the standard Euclidean distance measure for the
phenotype space (y^i_nk denotes the entry at (row n,
column k) of the matrix M_i):



$$d_E(M_i, M_j) = \sqrt{\sum_{n=1}^{N} \sum_{k=1}^{N} \left| y^i_{nk} - y^j_{nk} \right|^2} \tag{18}$$
There is no a priori structure assumed for the network,
hence the matrix is not constrained to any layered network
structure. If y_nk > 0 there exists a connection
between neuron n and neuron k. We are not restricted
to the upper triangular part of the matrix, thus in principle
feedback connections can be specified. If the y_nk
are restricted to the values {0, 1}, we only specify the
connection between the neurons. If we extend the allowed
values to all integers in the set {0, ..., N_sym}, it is
possible to further define initial values for the weights
and the thresholds. In connection with gradient descent
algorithms for the fine tuning of the weights
this approach has been successful, see (Sendhoff et al.
1997), and we will therefore include it in the following
examinations.

There have been several proposals on how to organise
the genotype space and the mapping f : Q → V for the
optimisation of the structure of neural networks, see
also (Whitley 1995). Most of them can be categorised
into three principal approaches: the direct encoding,
the recursive or grammar encoding and the cellular
encoding. The first attempts to use evolutionary (genetic)
algorithms for structure optimisation employed
the direct encoding method (Miller et al. 1989), and
it probably still is the most frequently used method.
The recursive encoding has been introduced by Kitano
(1990) in order to overcome the bad scaling behaviour
of the direct encoding method for large networks and to
favour a modular structure of the network. The third
approach, the cellular encoding, proposed by Gruau
(1993), uses a tree representation of operators which
construct the network. The structure of the tree and
therefore of the network is optimised by genetic programming.
In the following we will examine the direct
encoding and the recursive encoding with respect to
the proposed measure, eqs. (5, 6). That is, we will
examine whether the neighbourhood structure on Q
is carried over to V, i.e. whether the system is strongly
causal. We will restrict ourselves to the direction
Q → V and we will not sample uniformly in Q. The
reason is that for the mutation operator p± (see eq.
(19)) the probability to reach the genotypes g_j and g_k
from g_i is zero for almost all uniformly sampled triples.
Thus, if we want to examine the system (mapping,
p±), we sample g_i uniformly and obtain g_j and g_k via
mutation from g_i with the probability p_init. By tuning
p_init, we are at the same time able to determine
how local the three chromosomes are. We then derive
the probability to reach g_j and g_k from g_i by "normal
mutation".

4.1 The direct encoding method 



In the direct encoding method the chromosome consists
of the whole connection matrix. Usually all matrix
rows are concatenated to form the chromosome,
whose elements we want to denote by x_n. The range
of allowed values for x and y can be {0, 1} or the
integer set {0, ..., N_sym}. The following operators
(X ∈ [0, 1[ is a uniformly sampled number) have been
used

$$p_\pm x = \begin{cases} x + 1 & X < 0.5 \\ x - 1 & X \ge 0.5 \end{cases} \tag{19}$$
$$p_u x = \lfloor X \cdot (N_{sym} + 1) \rfloor \tag{20}$$

p_u replaces x by a new integer with equal probability
from the set {0, ..., N_sym}. We used the Euclidean measure
of distance for matrices on the phenotype space
and a distance measure which only counts structural
differences, d_I(M_i, M_j).

The results for the probability P(A|B), that is the
probability that (d_{E,I} denotes d_E or d_I)

$$A := d_{E,I}(M_i, M_j) > d_{E,I}(M_i, M_k) \tag{22}$$

holds in phenotype space, given that

$$B := -\log(P(g_i \to g_j)) > -\log(P(g_i \to g_k)) \tag{23}$$

is true in genotype space, are presented in Table 1.
The standard setting (x ∈ {0, 1}, d_I, p_u) is strongly
causal in the Q → V direction. However, if the allowed
values are extended to an interval of integers,
all settings have problems at least for the structural
distance measure. Thus, we conclude that even direct
encoding methods are not straightforwardly strongly
causal if we depart from the basic setting.
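
The ingredients of this experiment can be sketched in a few lines. The code below is an illustration under assumptions rather than the setup used for Table 1: clipping p± at the borders of the value range is an assumption, and the structural distance is assumed to count the entries in which exactly one of the two matrices has a connection. All function names are hypothetical.

```python
import math
import random


def mutate_plus_minus(x, n_sym, rng):
    """p±: add or subtract one with equal probability.
    Clipping to {0, ..., n_sym} is an assumption of this sketch."""
    step = 1 if rng.random() < 0.5 else -1
    return min(n_sym, max(0, x + step))


def mutate_uniform(n_sym, rng):
    """p_u: replace the entry by a uniformly drawn integer from {0, ..., n_sym}."""
    return rng.randrange(n_sym + 1)


def euclidean_distance(m_i, m_j):
    """Euclidean distance between two connection matrices (lists of rows)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_i, row_j in zip(m_i, m_j)
                         for a, b in zip(row_i, row_j)))


def structural_distance(m_i, m_j):
    """Assumed structural distance: number of entries where exactly one
    of the two matrices specifies a connection (entry > 0)."""
    return sum((a > 0) != (b > 0)
               for row_i, row_j in zip(m_i, m_j)
               for a, b in zip(row_i, row_j))


if __name__ == "__main__":
    m_i = [[0, 3], [0, 0]]
    m_j = [[0, 4], [0, 0]]   # same structure, one initial value changed
    m_k = [[0, 3], [1, 0]]   # one additional connection
    print(euclidean_distance(m_i, m_j), euclidean_distance(m_i, m_k))
    print(structural_distance(m_i, m_j), structural_distance(m_i, m_k))
```

In this toy example both mutants are equally far from M_i in the Euclidean measure but differ in the structural measure, which hints at why the two distance measures can rate the same mutation operator differently once integer entries are allowed.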





                        d_E/p±    d_I/p±    d_E/p_u   d_I/p_u
x ∈ {0, 1}                                            0.0
x ∈ {0, ..., N_sym}     0.0       0.614     0.662     0.564

Table 1: Numerical estimation of the probabilities
P(A|B), using combinations of the two different distance
measures and mutation operators. The probabilities
have been estimated from 10^5 trials (N_sym = 10
and p_init = 0.25).

Figure 4: One element is replaced by four elements
in the recursion step via the small chromosome S_C →
large chromosome L_C mapping.

4.2 The recursive encoding method 

In all encoding methods apart from the one discussed
above a more or less intrinsic mapping is introduced
from the genotype to the phenotype space. We already
argued why this is sensible and we now want to
examine to what extent a recursive encoding method
is strongly causal. The coding, described in (Sendhoff
et al. 1997), consists of four chromosomes, where only
the first two are important for the building process
of the connection matrix. In each iteration step every
element of the connection matrix is replaced by a
2 × 2 matrix of new elements. The new elements are
specified by a mapping from the small chromosome
S_C to the large chromosome L_C. The length of the
small chromosome N_{S_C} is variable, the length of the
large one is fixed by the condition N_{L_C} = 4 · N_{S_C}. At
each step i the first place N(y_nk) of each connection
matrix element y_nk in S_C is determined; for example
position N(y_nk = 7) = 3 in Figure 4. The element is
then replaced by the four elements at the positions

$$\big(4\,(N(y_{nk}) - 1) + 1,\;\; 4\,(N(y_{nk}) - 1) + 2,\;\; 4\,(N(y_{nk}) - 1) + 3,\;\; 4\,(N(y_{nk}) - 1) + 4\big) \tag{24}$$

in the large chromosome L_C. Figure 4 shows the replacement
of an element y_nk = 7 by the four elements
(3, 6, 9, 1). In case y_nk is not in S_C, it is replaced by
four so called terminal symbols (in the notation of integer
strings, the most convenient choice is zero). A
terminal symbol is in turn always replaced by another
four terminal symbols in a recursion step. Figure 5
shows the evolution of an 8 × 8 connection matrix M_con
following the introduced rules.

Figure 5: Scheme of the recursive development of the
connection matrix up to a size of 8 × 8.

This network connection matrix is a function of the
mutation and crossover probabilities, the chromosome
length d_{S_C}, the number of iteration steps N_steps and
of the size of the set of integers {1, ..., N_sym} of allowed
values for both strings. We restrict ourselves to mutations
on S_C and examine the probability (1 − P(A|B)) as a
function of d_{S_C} and N_sym; the results are shown in
Figure 6 (a) for the Euclidean distance measure
d_E(M_i, M_j) and in (b) for the structure distance
measure d_I(M_i, M_j). Only the results for the p±
mutation operator are shown, because the values for p_u
are only slightly lower and show a similar behaviour as in
Figure 6. We note that the system is generally not strongly
causal, especially for specific combinations (d_{S_C}, N_sym).
Furthermore, the best (lowest, since causality violations are
shown) values on average are reached if the encoding
parameters d_{S_C} and N_sym differ only slightly, thus we
conclude d_{S_C} ≈ N_sym. We also note that the differences
between d_E and d_I are only marginal both qualitatively
and quantitatively. Thus, from the point of view
of causality the combined optimisation of the structure
and the initial weight values seems to be sensible.

Figure 6: Since it is easier to visualise, the probability
to violate the causality condition (1 − P(A|B)) is
shown for the (a) Euclidean distance measure d_E and
(b) structure distance measure d_I. The values have
been estimated from 10^5 trials (p_init = 0.25).
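
The recursive development described in this section can be sketched as follows. This is a minimal illustration under assumptions that are not stated in the text: the development starts from a 1 × 1 matrix holding the first element of S_C, and the four replacement elements fill the 2 × 2 block row by row. The chromosome contents in the example are hypothetical, except that the symbol 7 at position 3 of S_C is replaced by (3, 6, 9, 1) as in Figure 4.

```python
def replacement_block(y, small, large, terminal=0):
    """Four elements of L_C that replace the matrix element y."""
    if y == terminal or y not in small:
        return [terminal] * 4            # terminals stay terminals
    first = small.index(y)               # zero-based; N(y) in the text is first + 1
    return list(large[4 * first: 4 * first + 4])


def develop(small, large, steps, terminal=0):
    """Develop a connection matrix of size 2**steps from S_C and L_C."""
    if len(large) != 4 * len(small):
        raise ValueError("L_C must be four times as long as S_C")
    matrix = [[small[0]]]                # assumed start symbol
    for _ in range(steps):
        size = len(matrix)
        grown = [[terminal] * (2 * size) for _ in range(2 * size)]
        for n in range(size):
            for k in range(size):
                b = replacement_block(matrix[n][k], small, large, terminal)
                # Assumed layout: the block is filled row by row.
                grown[2 * n][2 * k], grown[2 * n][2 * k + 1] = b[0], b[1]
                grown[2 * n + 1][2 * k], grown[2 * n + 1][2 * k + 1] = b[2], b[3]
        matrix = grown
    return matrix


if __name__ == "__main__":
    small = [2, 5, 7]
    large = [5, 7, 2, 4, 1, 1, 8, 2, 3, 6, 9, 1]
    for row in develop(small, large, steps=3):
        print(row)
```

Because a single entry of S_C or L_C influences many entries of the developed matrix, one mutation can change the phenotype substantially, which is the source of the causality violations examined above.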

We will now, similar to section 3.3 in the domain of parameter
optimisation, try to lower the probability of
causality violations with the help of a new position dependent
mutation operator. Firstly, we have to identify
typical settings which are responsible for causality
violations. One is a direct consequence of the redundant
nature of the encoding, which from the viewpoint of
accumulated mutation is also advantageous. Hence, we do not try
to change the encoding to be less redundant, but instead
we change the mutation operator, so that the
mutation probability rises for redundant chromosome
entries. If p_m is the probability to mutate and N_{x_k(S_C)}
denotes the number of occurrences of symbol x_k(S_C)
in S_C before the position of x_k(S_C), we write

$$p_m(x_k(S_C)) = p_m \cdot \big(N_{x_k(S_C)} + 1\big) \tag{25}$$

Secondly, we observe that all elements from S_C which
occur in the first four elements in L_C have a large impact
on the connection matrix, since the first element
in S_C is always mapped onto this first block of elements
in L_C. Therefore, we suggest a second modification to
the mutation operator:

$$p_m(x_k(S_C)) = P_m \quad \forall\, x_k(S_C) \in \{x_1(L_C), \ldots, x_4(L_C)\} \tag{26}$$
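
The first modification, eq. (25), can be sketched as follows. The function name is illustrative, and capping the rate at 1 is an assumption of this sketch; the second modification, eq. (26), is not implemented here.

```python
def redundancy_scaled_rates(small, p_m):
    """Per-position mutation rates following eq. (25): the base rate p_m is
    multiplied by one plus the number of earlier occurrences of the symbol."""
    earlier_counts = {}
    rates = []
    for symbol in small:
        earlier = earlier_counts.get(symbol, 0)
        rates.append(min(1.0, p_m * (earlier + 1)))  # cap at 1 (assumption)
        earlier_counts[symbol] = earlier + 1
    return rates


if __name__ == "__main__":
    # The symbol 2 occurs three times; later copies mutate more often.
    print(redundancy_scaled_rates([2, 5, 2, 7, 2], p_m=0.05))
```

Since only the first occurrence of a symbol in S_C determines the replacement, the later copies are redundant, and raising their mutation rate leaves the expected effect on the phenotype small.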

Figure 7 (a) and (b) show the results for the probability
of causality violating steps for the mutation operator
with the modifications eqs. (25, 26) compared to
the fixed mutation rate p_m (dashed curve). In Figure 7
(a) we kept N_sym constant and changed the length
of S_C, since we expect that in this case modification
(25) will have the largest impact because the amount
of redundant elements in S_C rises with d_{S_C}. Indeed,
we observe that (1 − P(A|B)) is considerably reduced
and that the effect is more pronounced for larger values
of d_{S_C}. Figure 7 (b) shows experiments carried
out for the combinations N_sym = d_{S_C} which, as we
pointed out earlier, are the best choices for the coding
parameters. The new mutation operator reduces the
probability of causality violations also in these cases,
however the difference to the fixed mutation rate is
smaller than in Figure 7 (a). Thus, we conclude that
minor modifications of the mutation operator can already
have a causality-enhancing effect on the search
process and that it is worthwhile to analyse the (genotype
→ phenotype, mutation) system with respect to
the question why and for which specific settings problems
can occur.

Figure 7: The probability to violate the causality condition
(1 − P(A|B)) (see also Fig. 6) estimated from
10^5 trials, (a) N_sym = 10 is kept constant and (b) the
relation N_sym = d_{S_C} is fixed. The interlaced curve
shows the values for the mutation operator with the
modifications (25, 26) and the dashed curve for the
standard mutation rate p_m (p_init = 0.25).

5 Conclusion 

In this paper we suggested a condition which the
setting (genotype → phenotype mapping, mutation)
should fulfill in order to allow gradual changes for a
local search on the phenotype space which can be controlled
via the mutation parameter on the genotype
space. We applied the probabilistic causality condition
to problems in the domain of parameter optimisation
and structure optimisation both analytically and
constructively. Thus, besides examining the search
process, we also suggested variations in the mutation
operator to improve the setting with respect to our
condition. Especially in the latter domain, where complicated
mappings are commonly used, we believe the
measure could be a useful tool for constructing evolutionary
algorithms. In the case of parameter optimisation,
Figure 3 shows that the setting which enhances
the causal behaviour at the same time improves the
performance.

Acknowledgements 

This research work is part of the BMBF SONN project 
under Grant No. 01IB401A9. 

References 

Gruau, F. (1993). Genetic synthesis of modular neural 
networks. In S. Forrest (Ed.), Proc. 5th Int. Conf. Genetic 
Algorithms, pp. 318-325. Morgan Kaufmann. 

Jones, T. and S. Forrest (1995). Fitness distance cor- 
relation as a measure of problem difficulty for genetic 
algorithms. In L. Eshelman (Ed.), Proc. 6th Int. Conf. 
Genetic Algorithms, pp. 184-192. Morgan Kaufmann. 

Kitano, H. (1990). Designing neural networks using ge- 
netic algorithms with graph generation system. Complex 
Systems 4, 461-476. 

Lohmann, R. (1993). Structure evolution and incomplete 
induction. Biol. Cybern. 69(4), 319-326. 

Manderick, B., M. de Weger, and P. Spiessens (1991). The 
genetic algorithm and the structure of the fitness land- 
scape. In R. Belew and L. Booker (Eds.), Proc. 4th Int. 
Conf. Genetic Algorithms, pp. 143-150. Morgan Kauf- 
mann. 

Miller, G. and P. Todd (1989). Designing neural net- 
works using genetic algorithms. In D. Schaffer (Ed.), Proc. 
3rd Int. Conf. Genetic Algorithms, pp. 379-384. Morgan 
Kaufmann. 

Rechenberg, I. (1994). Evolutionsstrategie '94. Friedrich
Frommann Holzboog. 

Schuster, P. (1995a). Artificial life and molecular evolu- 
tionary biology. In F. Moran et al. (Eds.), Advances in 
Artificial Life, pp. 3-19. Springer. 

Schuster, P. (1995b). How to search for RNA structures 
- theoretical concepts in evolutionary biotechnology. J. 
Biotechnology 41, 239-257.

Sendhoff, B. and M. Kreutz (1996). Analysis of possi- 
ble genome-dependence of mutation rates in genetic al- 
gorithms. In T. Fogarty (Ed.), Evolutionary Computing, 
Volume 1143 of LNCS, pp. 257-269. Springer. 

Sendhoff, B. and M. Kreutz (1997). Evolutionary opti- 
mization of the structure of neural networks using a re- 
cursive mapping as encoding. In Proc. 3rd Int. Conf. Arti- 
ficial Neural Networks and Genetic Algorithms. Springer. 

Sendhoff, B., M. Kreutz, and W. von Seelen (1997). 
Causality and the analysis of evolutionary algorithms. 
Submitted to IEEE Trans. Evolutionary Computation. 

Whitley, L. (1995). Genetic algorithms and neural net- 
works. In J. Periaux and G. Winter (Eds.), Genetic Algo- 
rithms in Engineering and Computer Science, Chapter 11. 
John Wiley. 



